A Comparative Study of Two Procedures for Calculating Likelihood Ratio in Forensic Text Comparison: Multivariate Kernel Density vs. Gaussian Mixture Model-Universal Background Model
Abstract
We compared the performance of two procedures for calculating the likelihood ratio (LR) on the same set of text data. The first was a multivariate kernel density (MVKD) procedure, which has been successfully applied to various types of forensic evidence, including glass fragments, handwriting, fingerprints, voice, and texts. The second was a Gaussian mixture model – universal background model (GMM-UBM) procedure, which has been commonly used in forensic voice comparison (FVC) with so-called automatic features. Previous studies have applied the MVKD system to electronically generated texts to estimate LRs, but no previous study appears to have applied the GMM-UBM system to such texts. The GMM-UBM system has been reported to outperform the MVKD system in FVC. The data used for this study were chatlog messages collected from 115 authors, divided into test, background, and development databases. Three sample sizes (500, 1500, and 2500 words) were used to investigate how sensitive performance is to sample size. The results show that, regardless of sample size, the GMM-UBM system performed better than the MVKD system with respect to both validity (= accuracy), measured by the log-likelihood-ratio cost (Cllr), and reliability (= precision), measured by the 95% credible interval (CI).
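The validity metric named in the abstract, Cllr, can be illustrated with a minimal sketch. It assumes the commonly used definition of the log-likelihood-ratio cost: the average of log2(1 + 1/LR) over same-author comparisons and of log2(1 + LR) over different-author comparisons, halved. The function name `cllr` and the LR values in the usage lines are hypothetical illustrations, not code or results from the study.

```python
import math


def cllr(same_author_lrs, different_author_lrs):
    """Log-likelihood-ratio cost (Cllr) for two lists of likelihood ratios.

    same_author_lrs: LRs from comparisons where both texts share an author.
    different_author_lrs: LRs from comparisons between different authors.
    All LRs must be positive. Lower Cllr indicates better validity.
    """
    if not same_author_lrs or not different_author_lrs:
        raise ValueError("both LR lists must be non-empty")
    if any(lr <= 0 for lr in list(same_author_lrs) + list(different_author_lrs)):
        raise ValueError("likelihood ratios must be positive")

    # Same-author comparisons are penalised when the LR is small.
    ss_penalty = sum(math.log2(1 + 1 / lr) for lr in same_author_lrs)
    ss_penalty /= len(same_author_lrs)

    # Different-author comparisons are penalised when the LR is large.
    ds_penalty = sum(math.log2(1 + lr) for lr in different_author_lrs)
    ds_penalty /= len(different_author_lrs)

    return 0.5 * (ss_penalty + ds_penalty)


# Hypothetical LR values, for illustration only (not data from the study).
uninformative = cllr([1.0], [1.0])        # every LR equals 1
well_behaved = cllr([100.0], [0.01])      # LRs point the right way
misleading = cllr([0.01], [100.0])        # LRs point the wrong way
```

Under this definition a system that always outputs LR = 1 scores exactly 1, so values below 1 indicate that the LRs carry useful information, and values above 1 indicate that they are misleading on average.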
Similar resources
An Effect of Background Population Sample Size on the Performance of a Likelihood Ratio-based Forensic Text Comparison System: A Monte Carlo Simulation with Gaussian Mixture Model
This is a Monte Carlo simulation-based study that explores the effect of the sample size of the background database on a likelihood ratio (LR)-based forensic text comparison (FTC) system built on multivariate authorship attribution features. The text messages written by 240 authors who were randomly selected from an archive of chatlog messages were used in this study. The strength of evidence (...
Likelihood Ratio Calculation in Acoustic-Phonetic Forensic Voice Comparison: Comparison of Three Statistical Modelling Approaches
This study compares three statistical models used to calculate likelihood ratios in acoustic-phonetic forensic-voice-comparison systems: multivariate kernel density, principal component analysis kernel density, and a multivariate normal model. The data were coefficient values obtained from discrete cosine transforms fitted to human-supervised formant-trajectory measurements of tokens of /iau/ fr...
Traditional Forensic Voice Comparison with Female Formants: Gaussian mixture model and multivariate likelihood ratio analyses
The first likelihood ratio-based forensic voice comparison on female voices, and the first forensic use of Gaussian mixture models on traditional features, are described. A GMM-UBM LR-based comparison is performed on the first three formants of the five long /monophthongs/ of 20 General Australian English female speakers in non-contemporaneous recordings separated by one to five weeks. Comparis...
Gaussian Mixture Models of Between-Source Variation for Likelihood Ratio Computation from Multivariate Data
In forensic science, trace evidence found at a crime scene and on a suspect has to be evaluated from the measurements performed on it, usually in the form of multivariate data (for example, several chemical compounds or physical characteristics). In order to assess the strength of that evidence, the likelihood ratio framework is being increasingly adopted. Several methods have been derived in or...
Probabilistic Evaluation of SMS Messages as Forensic Evidence: Likelihood Ratio Based Approach with Lexical Features
This study is one of the first likelihood ratio-based forensic text comparison studies in forensic authorship analysis. The likelihood-ratio-based evaluation of scientific evidence has started being adopted in many disciplines of forensic evidence comparison sciences, such as DNA, handwriting, fingerprints, footwear, voice recording, etc., and it is largely accepted that this is the way to ensu...
Publication date: 2013